Estimating evolutionary distances between genomic sequences from spaced-word matches
Related resources
Estimating Evolutionary Distances between Sequences
These notes accompany a lecture at the summer school on mathematics for bioinformatics, Centre de Recherches Mathematiques, Montreal, August 2003. I develop basic Markov models for sequence evolution and show how these may be used to estimate evolutionary divergences between sequences. I then discuss extensions of the basic model to situations where the evolutionary rate varies over time and fo...
Approximate Word Matches between Two Random Sequences
Given two sequences over a finite alphabet L, the D2 statistic is the number of m-letter word matches between the two sequences. This statistic is used in bioinformatics for expressed sequence tag database searches. Here we study a generalization of the D2 statistic in the context of DNA sequences, under the assumption of strand symmetric Bernoulli text. For k < m, we look at the count of m-lett...
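As a rough illustration of the definition in this abstract, the exact-match D2 statistic can be sketched in Python as the number of position pairs (i, j) at which the two sequences share the same m-letter word. The function name `d2_statistic` and its parameter names are hypothetical, chosen only for this sketch. It covers the exact-match count only, not the generalization the abstract goes on to study.

```python
from collections import Counter


def d2_statistic(seq_a: str, seq_b: str, m: int) -> int:
    """Count pairs (i, j) with seq_a[i:i+m] == seq_b[j:j+m].

    Hypothetical helper for illustration; exact word matches only.
    """
    if m <= 0:
        raise ValueError("word length m must be positive")
    # Tally every overlapping m-letter word in each sequence.
    words_a = Counter(seq_a[i:i + m] for i in range(len(seq_a) - m + 1))
    words_b = Counter(seq_b[j:j + m] for j in range(len(seq_b) - m + 1))
    # A word seen x times in seq_a and y times in seq_b contributes x * y pairs.
    return sum(count * words_b[word] for word, count in words_a.items())
```

Counting words first and multiplying the tallies avoids comparing every pair of positions directly, so the work grows with the sequence lengths rather than with their product.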
APPROXIMATE WORD MATCHES BETWEEN TWO RANDOM SEQUENCES By Conrad
Given two sequences over a finite alphabet L, the D2 statistic is the number of m-letter word matches between the two sequences. This statistic is used in bioinformatics for expressed sequence tag database searches. Here we study a generalization of the D2 statistic in the context of DNA sequences, under the assumption of strand symmetric Bernoulli text. For k < m, we look at the count of m-lett...
Estimating phylogenetic distances between genomic sequences based on the length distribution of k-mismatch common substrings
Various approaches to alignment-free sequence comparison are based on the length of exact or inexact word matches between two input sequences. Haubold et al. (2009) showed how the average number of substitutions between two DNA sequences can be estimated based on the average length of exact common substrings. In this paper, we study the length distribution of k-mismatch common substrings betwee...
Fast and accurate phylogeny reconstruction using filtered spaced-word matches
Motivation: Word-based or 'alignment-free' algorithms are increasingly used for phylogeny reconstruction and genome comparison, since they are much faster than traditional approaches that are based on full sequence alignments. Existing alignment-free programs, however, are less accurate than alignment-based methods. Results: We propose Filtered Spaced Word Matches (FSWM), a fast alignment-free...
Journal
Journal title: Algorithms for Molecular Biology
Year: 2015
ISSN: 1748-7188
DOI: 10.1186/s13015-015-0032-x